Computing fluxes and chemical potential distributions in biochemical 
networks: energy balance analysis of the human red blood cell 

Daniele De MartinoJ Matteo FigliuzziJ Andrea De Martino,^ and Enzo Marinari^ 
Wipartimento di Fisica, Sapienza Universita di Roma, P.le A. Mom 2, Roma, Italy 
^CNR/IPCF, Dipartlmento di Fisica, Sapienza Universita di Roma, P.le A. Mom 2, Roma, Italy 

The analysis of non-equilibrium steady states of biochemical reaction networks relies on finding the configu- 
rations of fluxes and chemical potentials satisfying stoichiometric {mass balance) and thermodynamic {energy 
balance) constraints. Efficient methods to explore such states are crucial to predict reaction directionality, cal- 
culate physiologic ranges of variability, estimate correlations, and reconstruct the overall energy balance of the 
network from the underlying molecular processes. While different techniques for sampling the space generated 
by mass balance constraints are currently available, thermodynamics is generically harder to incorporate. Here 
we introduce a method to sample the free energy landscape of a reaction network at steady state. In its most gen- 
eral form, it allows to calculate distributions of fluxes and concentrations starting from trial functions that may 
contain prior biochemical information. We apply our method to the human red blood cell's metabolic network, 
whose space of mass-balanced flux states has been sampled extensively in recent years. Specifically, we profile its 
thermodynamically feasible flux configurations, characterizing in detail how fluctuations of fluxes and potentials 
are correlated. Based on this, we derive the cell's energy balance in terms of entropy production, chemical work 
done and thermodynamic efficiency. 
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I. INTRODUCTION 

The dynamics of a chemical reaction network is ruled by 
the underlying thermal fluctuations through the Arrhenius 
law that relates the rates of reactions to the activation ener- 
gies and the strength of noise (the temperature). When non- 
equilibrium steady states (NESS) are reached, the net flow of 
reactions is constrained to proceed downhill in the free en- 
ergy landscape [1 , 2|. Many aspects involved in the analysis 
of biochemical networks (like the cellular metabolism of liv- 
ing organisms) at stationarity hinge on the explicit inclusion 
of thermodynamic constraints into the network model; among 
them, the assignment of reaction directions (and in turn the 
calculation of feasible flux configurations), the assessment of 
metabolite producibility, the prediction of metabolite concen- 
trations and the estimation of chemical potentials. 

Steady-state schemes used to predict flux values are mostly 
based on mass-balance constraints only, and the way in which 
thermodynamics is included could significantly impact their 
results. For Flux-Balance-Analysis (FBA) |l3]|4l, where mass- 
balanced flux configurations are collapsed into a single opti- 
mal solution that maximizes a pre-determined objective func- 
tion |5, 6 1, the removal of thermodynamic inconsistencies by 
additional energetic constraints has been proven to be use- 
ful to estimate concentrations and reaction affinities besides 
fluxes IT UTTI . In absence of clear optimization criteria or, 
more generally, to retrieve global information on the feasi- 
ble network states allowed in given extracellular and intracel- 
lular conditions (e.g. physiologic ranges of variability), it is 
instead important to characterize the space of flux configura- 
tions compatible with mass- and energy-balance constraints in 
a statistically robust way, as allowed e.g. by Monte Carlo (for 
small networks lfT2l ) or message-passing (for larger networks 
ifTSl ) methods for mass-balance equations. 

Here we propose a scalable technique to obtain refined in- 



formation on the distribution of fluxes, chemical potentials 
and intracellular concentrations for non-equilibrium steady 
states, as well as predictions for correlations and reaction di- 
rections. The method is based on perceptron learning, extends 
the sampling procedure used in |14| to explore the space of 
flux states compatible with minimal stability constraints a la 
Von Neumann ifTSJI and uses the stoichiometric matrix and the 
experimentally determined potentials in standard conditions 
of a set of metabolites as its basic input. In essence, it gen- 
erates feasible configurations of fluxes and concentrations by 
exploiting the input information and a suitably chosen update 
dynamics to build-up correlations between chemical poten- 
tials and fluxes. We apply it to the control case of the model of 
the human red cell (hRBC) metabolism reconstructed in ||T2j, 
for which we derive and analyze the viable state space and the 
emerging correlations. This ultimately allows to characterize 
the cell as a chemical engine. In particular, we shall appraise 
its chemical energy balance in the standard terms of entropy 
production, work done and thermodynamic efficiency. 



II. BACKGROUND 

A metabolic reaction network is defined by a matrix S en- 
coding the stoichiometric coefficients ^^ of the compounds 
jj. - 1, . . . , M in the reaction / = 1, . . . , A^. Conventionally, 
positive (resp. negative) stoichiometric indices characterize 
the products (resp. substrates) of the forward reaction. In- 
takes (resp. outtakes) are characterized as having only positive 
(resp. negative) stoichiometry. Neglecting molecular noise 
and dilution due to expanding cell volume, the concentration 
c^ of metabolite ^ evolves according to 



c^ = J]^fy,(c,k,...) 



(1) 



1=1 



where v, is the net flux of reaction /, which depends in princi- 
ple on the concentration vector c, on a vector of reaction con- 
stants k, as well as on other factors like enzyme availability 
and kinetics, transport details, etc. The usual working hypoth- 
esis is to assume stationarity: in this case the flux vector v 
satisfies the M linear mass-balance equations (MBE) 



Ev = 0. 



(2) 



Bounds of the type v™" < v, < v™" are also usually specified. 
These include both physical considerations, like an assign- 
ment of reversibility by thermodynamic arguments, and func- 
tional aspects, e.g. the fact that a certain reaction should op- 
erate above a given flux threshold in order to provide a phys- 
iological function. Cellular metabolic networks usually have 
N > M so that the system (pi is underdetermined, leading to 
a solution space of dimension D - N - rank(S). Uniform 
sampling is computationally affordable only for small enough 
D (a few tens), e.g. via Monte Carlo methods llT2l . However 
if the interest is focused on finding flux configurations that 
are optimal with respect to specific biological functionalities 
(e.g. biomass production), the solutions of Q can be further 
constrained to maximize an objective function. This is the 
standard framework of FBA, as implemented with consider- 
able success to describe for example optimal bacterial growth 
in wild type and knock-out conditions fTS'-TSl . 

In a different approach one leaves the solution space func- 
tionally unconstrained while studying the metabolite produc- 
tion profiles compatible with a given extracellular medium. 
For any choice of the environment (i.e. of the intakes, the out- 
takes being left unspecified), a metabolite fi is producible if 
there exists a flux vector v such that 2,- ^Cv, > 1.19,1 . Feasible 
production profiles are then described by the solutions of 



Sv>0 



(3) 



The resulting vector y = Sv encodes the information on 
whether each metabolite is producible (y^ > 0) or not (y'^ = 0) 
in a given feasible flux state. Producible metabolites might 
either be employed in other macromolecular processes or be- 
come cellular outtakes, so that "objective functions" can in 
this way emerge as statistically robust production profiles 
II2OI . Eq. [3]is Von Neumann's flux stability constraint (VNC) 
for production networks ETl and simply states that for each 
chemical species at stationarity the overall consumption can- 
not exceed the total supply. The inequality that distinguishes 
Q from (|2]) allows for the efficient sampling of its solution 
space even for genome-scale networks of the size of E. coli, 
where D is of the order of a few hundreds lfT4l . 

Going beyond the flux problem, the second law of ther- 
modynamics dictates that in a NESS the reaction directions 
Si = sign(v,) must be related to the chemical driving forces 
(affinities) AG, by 



SiAGi < Vi 



(4) 



where the equality only holds if / is in equilibrium. The free 
energy changes AG, can be written in terms of the chemical 
potentials g^ (the Gibbs free energy per mole of the species //) 



M 



AG,- = ;^^y 



(5) 



The existence of non-trivial states of chemical equilibrium 
(trivial states are the ones with g^ - for each ju) im- 
plies M > rank(E). The thermodynamic constraints (CI I 



are usually implemented a priori, by pre-assigning reaction 
reversibilities based on the estimation of chemical potentials 
in physiologic conditions |22, 23|. In other words, the flux 
of a reaction classified as irreversible in the forward direction 
should be sampled under the condition of being non-negative. 
In cases of mis-assignments, the flux problem can lead to in- 
consistent configurations |2|. Note that, in absence of pre- 
scribed flux bounds, given a vector z = {z,) (/ = 1,...,A^) 
such that 



X^r^-0 v^, 



(6) 



ieR\U 



where the sum extends over all reactions (set R) excluding up- 
takes (set U), for each solution v* = {v*) of the flux problem 
(pi or (pj, the vector defined as v* + kz is again a solution 
for any k e R. In essence, a thermodynamically consistent 
assignment of reaction directions reduces this degeneracy by 
bounding some degrees of freedom. A concrete example of 
this is discussed in Appendix Al. Our goal in this note is to 
devise a tool to obtain flux vectors v = {v,) and chemical po- 
tential vectors g - {§'') that are joint solutions of (pi) (or (pi) 
and(|CT]l. 



III. METHODS 

We have considered two (non-equivalent) solution schemes. 
In Method (a), the flux and the chemical potential problems 
are decoupled: first, the former (i.e. Eq. (pi or (pi) is solved 
for V for a given reversibility assignment (e.g. based on phys- 
iological data), then ( |Cl| l is solved for g using as an input the 
signs s, of the net fluxes that solve the flux problem. This pro- 
cedure allows to remove thermodynamically inconsistent so- 
lutions of the flux problem and provides estimates for chem- 
ical potentials, but it doesn't allow to predict reaction direc- 
tionalities. Method (b) instead solves the mass balance and the 
energy balance problems jointly without any prior assumption 
on reversibility. We shall begin this section by outlining the 
two algorithms; then we shall briefly discuss the algorithmic 
setup and the cellular data we have analyzed (a detailed ac- 
count of these issues is presented in Appendices A2 and A3. 



A. Method (a) Decoupling the flux and the free energy 
problems 

The solutions of the flux problems (pi and (pi (starting from 
a priori reversibility assignments) can be sampled respectively 
by Monte Carlo methods (for small networks, see e.g. IIT2I ) 
and by the MinOver^ scheme (even for large systems, see e.g. 



lfT4l El)- We assume to have obtained the resulting distri- 
butions for the net fluxes and hence a vector s - {5,) of net 
reaction directions (+1 for forward, -1 for backward, for 
bidirectional). The free energy landscape reconstruction prob- 
lem consists, given s, in sampling vectors g that satisfy (CI 1, 
namely such that 



M 



:''„'" 



>o v/ 



(7) 



To approach it, we note that the problems dTJi and (J3]l are for- 
mally equivalent. By analogy with |15|, we can then employ 
the MinOver scheme originally designed to analyze percep- 
tron learning |25| and later adapted to generate solutions of 
(Is}. Let Po(g) = n^i K(8'') denote a trial probability dis- 
tribution of chemical potentials that contains some a priori in- 
formation on experimentally determined potentials and con- 
centrations. For instance, F^ could be a uniform distribution 
centered around the known experimental values of g^ and of 
sufficiently large width to span several orders of magnitude in 
trial concentrations, or a flat unbiased uniform distribution. In 
essence, the algorithm is designed to modify Pq by building 
up correlations between g^'s, until a distribution matching the 
solution space of (JTl) is achieved. The steps are as follows. 

1. Generate a chemical potential vector g = {g''} from 
^o(g). 

2. Compute x = {jc,) from (J7]i and /q = arg min, x,-. 

3. If jc,„ > then g is a thermodynamic ally consistent 
chemical potential vector for s; exit (or go to 1 to ob- 
tain a different solution). 



4. If x,-„ < 0, update g as 



g" ^ g^-as,,^l 



(8) 



(where a > is a constant), go to 2 and iterate until 
convergence. 

As is generally true in MinOver schemes, the reinforcement 
term in ([8]l drives the gradual adjustment of potentials by en- 
suring that, at every iteration, the least satisfied constraint (la- 
beled /o) is improved. In particular, the chemical potentials of 
metabolites that are produced (resp. consumed) in reaction /q 
are decreased (resp. increased) at each time step until a state 
is achieved where all unidirectional fluxes descend in the free 
energy landscape. Note that even though the constraint (JTl) is 
absent if i, = 0, the above method still allows to retrieve infor- 
mation about the chemical potential of a metabolite involved 
in reversible processes (unless it is only involved in reversible 
processes, in which case its g^' is never updated). Convergence 
to a solution (if any) is guaranteed for any a > 0, the proof 
being a minor modification of the one shown in fTSl, and the 
space of feasible chemical potentials can be sampled by re- 
starting the process from different chemical potential vectors 
belonging to Po(g)- In this way, the final outcome is a set of 
correlated probability distributions for the §'"s. Note that, at 
odds with the method proposed in |9|, the final chemical po- 
tentials can exceed the bounds defined by Po(g). Details about 



the implementation of this scheme (e.g. the choice of a) are 
described in Appendix A3. 



B. Method (b) Joint solution of the flux and free energy 
problem 

Method (b) aims at solving the flux and the free energy 
problems simultaneously without relying on a priori infor- 
mation on reaction reversibility (i.e. all are initially assumed 
to be bidirectional). Besides a trial distribution Poig) for the 
chemical potentials, it uses the chemical potentials ^^^^, of the 
metabolites subject to uptakes. We focus on the case where the 
flux problem is represented by the VNC ([3]). From a compu- 
tational viewpoint, sampling the solution space of p} is done 
more conveniently by introducing a parameter p > such that 
the system ([3]) is recovered f or p — > 1 (see e.g. lfT4l [TSll ). 
In a fully reversible setting, the VNC takes a simple form by 
writing S = A - B where A. - [cl'] and B = {//') denote, 
respectively, the matrices of output and input stoichiometric 
coefficients, and by re-defining the flux variable v, as 

V; = Si^ii , where si = sign(v,) and 0; = |v,| . (9) 

Introducing the shorthand 

^>, V,) = 0(v,)« - pZt^) + 0(-y,)(^ - p<) (10) 

one sees 1261 that the solutions of ([3]l correspond to those of 

N 

fip)^Yj^i(p,Vi)<Pi>0 V;u, (11) 

1=1 

for p — > 1 . The algorithm proceeds as follows. 

1. Initialize p = p < 1 (e.g. p = 0). 

2. Generate a chemical potential vector g - [g^'] from 
A)(g). 

3. Assign reaction directions 

3A. For intracellular reactions: compute affinities as 
AG, - Yjfi^S''' ^nd assign directions as s,- = 
-sign(AG,) 

3B. For intakes: if g^" < ^ then s, = 1 (intake is 



, = (mactive). 



"') 



active), otherwise s 



4. Generate a vector ^ = {0,) from a uniform distribution, 
e.g. in [0, 1]. {(pi is the absolute values of flux v,, see 

©■) 



{yip)] from (111 and //q 



5. Compute y(p) 
arg min^/(p). 



6. If 3^°(p) > 0, then the configuration (v, g) with v, = 5,0, 
is a thermodynamically feasible configuration of fluxes 
and chemical potentials for p = p ; go to 8. 



7. If /"(p) < 0, then update as 

<Pi ^ (l>',^4>i+PC (12) 

with/3 > a constant. If sign(0j) < 0, then update g as 

g" ^ / + «< (13) 

and re-assign directions as in step 3. Let {s\] denote the 
updated directions. Then 

• M s'j - Si, then set 0,- - and go to 5 (iterate until 
convergence) 

• If s'. - -Si, then set 0,- = \(p'.\ and go to 5 (iterate 
until convergence to a solution) 

8. If p < pmax = 1 - e, update p as p — > p + (5 with 6 > 
and go to 5 and iterate until convergence 

The idea behind this scheme is to mimic a relaxational dynam- 
ics into a thermodynamically stable steady state. Following 
initialization, in step 3 directions are assigned according to a 



chemical potential vector. This assignment satisfies (CI I by 
definition and encodes information on the connected correla- 
tions among reaction affinities, i.e. 



{AGiAGj),.^Y,^';e.(g'^-{g'')f 



(14) 



{g^} denoting the average of g^ according to the trial func- 
tion Pq. The algorithm then performs a MinOver scheme for 
fluxes, trying to find a feasible flux configuration compatible 
with this assignment. As the dynamics proceeds, some of the 
fluxes might display a preference to revert. In these cases, we 
let the chemical potential vector slowly evolve following the 
scheme of Method (a). If this provides a new set of directions 
agreeing with the inversion, the latter is accepted, otherwise 
the reaction is shut down and the process is iterated. Once 
a configuration of fluxes and chemical potentials that satisfy 
( CI I and ( 11 1 at p = p is achieved, p is increased until p = 1 -e 
(with e the desired precision, typically 10"^ or less). Results 
at p = 1 can then easily be extrapolated, and different solu- 
tions at p = 1 can be sampled by re-starting from step 1 with 
a different vector g from Pq. In this way, flux configurations 
evolve in a thermodynamically coherent fashion till they reach 
production feasibility. During evolution, the algorithm mod- 
ifies the initial distribution of chemical potentials exploring 
states close to the initial conditions but with a dynamics that 
correlates them according to feasible flux directions. Details 
about the implementation of Method (b) are reported in Ap- 
pendix A3. 



The network includes three pathways, namely glycolysis (re- 
actions 1-13) and the pentose phosphate pathway (reactions 
14-21), through which glucose is degraded to produce high- 
energy molecules needed to maintain osmotic pressure, trans- 
membrane potential and redox state, plus a nucleotide salvage 
pathway (reactions 22-32). The functional part of the network 
lies in the three reactions (33-35) that are performing chem- 
ical work, namely: the ATPase pump, that maintains osmotic 
pressure and transmembrane potential by exchanging Na+ and 
K^ ions with the surrounding plasma; the NADHase pump, 
carrying out the reduction of oxidized heme groups; and the 
NADPHase pump, whose function is to reduce the glutathione 
(GSH) molecules that continuously get oxidized while acting 
against free radicals. 

The main reason for focusing on this network lies in the fact 
that for this case it is possible to carry out a full comparison 
with previous results obtained by sampling the solution space 
of both Q [[T2| and ([3]l |24|, as well as with experimental es- 
timates of concentration profiles and chemical potentials. The 
information on net reaction directions required for Method (a) 
has been extracted from these studies (see Section S3). As for 
the trial distribution of chemical potentials Po(g) = Y\fi Poig^) 
required by both Methods (a) and (b), we have considered two 
different cases (explained in detail in Section S3). 

(i) Poor input information. By virtue of (Cli a certain 
amount of a priori information on the g^'s (to be incorporated 
in Po) is required in order to reconstruct the entire free energy 
landscape. Case (i) reproduces a situation in which this prior 
knowledge roughly matches the necessary minimum. Specif- 
ically, the chemical potentials of 8 metabolites, namely the 
four sources that are present in the hRBC network, i.e. glucose 
(GLC), lactate (LAC), NH3 and CO2, plus inosine monophos- 
phate (IMP), adenine (ADE), NAD and NADP, is fixed in A, 
and the corresponding variables are not updated. For the other 
compounds we have chosen a P{J that reproduces the over- 
all statistics of chemical potentials: specifically, each g*^ (in 
units of KJ/mol) is selected independently and uniformly in 
[0, 2000] with probability p = 0.2 and in [-5000, 0] with 
probability 1 - p = 0.8. 

(ii) Rich input information. In this case Po(g) includes 
knowledge on the chemical potential of metabolites whose in- 
tracellular concentrations c^ are experimentally known (with 
errors), extracted from the formula g^ - ^q+ RT log c'^. How- 
ever, contrary to case (i) no chemical potential is fixed with the 
exception of that H2O, assuming water to be in a condensed 
phase. When we were unable to find reliable concentration 
estimates, the trial distribution of log c'^ was taken to be uni- 
form, centered in -4 and spanning four values symmetrically 
around the mean. 



C. The hRBC metabolic network 

We shall apply our schemes to the cellular metabolic net- 
work of the hRBC, reconstructed in |12|. The detailed re- 
construction as well as the auxiliary data employed for this 
study are exposed in Section S2. It consists of 35 intracel- 
lular reactions among 39 metabolites subject to 12 uptakes. 



IV. RESULTS 

A. Method (a), poor Input information 

Results for the chemical potential landscape of the hRBC 
reconstructed by Method (a) from poor input information and 
from the reaction directions obtained in 1121 and I.24J for 
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FIG. 1 Chemical potentials of metabolites in the hRBC metabolic 
network computed by Method (a) with poor input information: ex- 
perimental estimates (black markers) and values obtained for MBE 
(red markers) and VNC (green markers) direction assignments. The 
initial spread of chemical potentials encoded in Pq (spanning either 
the [0, 2000] or the [-5000, 0] range) is not shown. The errors on the 
experimental estimates of the fixed chemical potentials fall within 
the size of the dots. 



MBE and VNC respectively are shown in Fig.fT] Both mass- 
balance conditions turn out to provide thermodynamically fea- 
sible configurations of reaction directions and in both cases 
one obtains a broad agreement between computed and ex- 
perimentally measured chemical potentials. This shows that 
Method (a), apart from being able to verify the thermody- 
namic feasibility of a flux state, is able to retrieve information 
on the energy landscape. Keeping in mind that MBE represent 
tighter constraints on fluxes than VNC and that the networks 
used in |[T2]| and 1241 are not identical (specifically in uptakes), 
one sees that the directions extracted from Q slightly outper- 
form those resulting from Q in predicting chemical poten- 
tials, while both fail to predict (via Method (a)) the g^^'s of 
key metabolites like NADH and NADPH. The magnitude of 
the error bars reflects the considerable initial uncertainty we 
have assumed in the input information, as prior knowledge in 



this case is the minimal needed to solve (CI i unambiguously. 



B. Method (a), rich input information 

A much refined prediction is obtained using a rich input 
information. Results are displayed in Fig. [2] The computed 
affinities display significant changes with respect to the pic- 
ture embedded in Pq. For instance, several of the initial 
bounds on affinities indicate that the free energy change in a 
reaction is positive (e.g. PGI, PGK, G6PDH, TA), due either 
to the actual experimental estimates for the affinities or to the 
initial uncertainty we place on concentrations. Such bounds 
are altered by MinOver in a direction compatible with physi- 
ologic assignments (see Table S2), suggesting that the imple- 
mentation of Method (a) coiTelates the fluctuations of chem- 
ical potentials. At the same time, we observe that in some 
cases fluctuations of g'' exceed the initial boxes defined by 
Pq. Since the resulting ranges are compatible with the intake 
configuration, they may provide an estimate of the statistical 
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FIG. 2 Predictions obtained for the hRBC network by Method (a) 
with rich input information. Top: free energy changes: input infor- 
mation with error (black markers) and values obtained for MBE (red 
markers) and VNC (green markers) direction assignments. Bottom: 
measured metabolite (log-)concentrations: input information with er- 
ror (black markers) and values obtained for MBE (red markers) and 
VNC (green markers) direction assignments. 



fluctuations from cell to cell. This also allows to obtain an 
estimate for the concentration range, including for metabo- 
lites whose level has not been experimentally probed (6PGL, 
RL5P X5P R5P S7P). Our predictions for the levels of (1,3)- 
diphosphoglycerate ((1,3)-DPG), 2 -phosphogly cerate (2PG) 
and phosphoenolpyruvate (PEP) differ from the experimental 
estimates. This is a consequence of the fact that we are forcing 
the phosphoglycerate kinase (PGK) and the glyceraldehyde 
phosphate dehydrogenase (GAPDH) reactions in the forward 
direction, in agreement with the steady state direction assign- 
ments for glycolysis, even if the experimental values would 
classify them as reversible. In addition, we obtain different 
levels of key metabolites like ATP and inorganic phosphate, 
while our predictions for ADP and AMP fail in the MBE and 
VNC conditions, respectively. This can be partially traced 
back to the difficulties inherent in assessing the potentials of 
highly exchanged metabolites [23 1. On the other hand, the 
experimental estimates of these concentrations vary consider- 
ably across the literature (e.g. Il27l vs ESI ). Finally, a net in- 
take of phosphate groups and a net outtake of CO2 is predicted 
both for the MBE and the VNC states even if in physiological 
conditions the ratio of internal concentrations and the exter- 
nal levels in the blood would suggest otherwise (see Tables 
SI and S3). This, together with a slight inconsistency in the 
pH level, seems to call for the inclusion in the network of the 
carbonic anhydrase (CA-I) reaction, as well as of a bicarbon- 
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FIG. 3 Predictions obtained for the hRBC network by Method (b) 
with rich input information. Top: free energy changes: input infor- 
mation with error (black markers) and values obtained for MBE (red 
markers) and VNC (green markers) direction assignments. Bottom: 
measured metabolite (log-)concentrations: input information with er- 
ror (black markers) and values obtained for MBE (red markers) and 
VNC (green markers) direction assignments. 



ate (HCO3 ) outtake. The former carries out the intracellular 
pH buffering by hydrating CO2 into bicarbonate, while the 
latter releases HC07 through the BAND3 membrane protein, 



which can cover up to 25% of the membrane surface 
So, even if in the standard biochemical state species that dif- 
fer only by the hydration level are usually lumped together in 
the reactants list, we have chosen to treat carbon dioxide and 
bicarbonate as different metabolites in the implementation of 
Method (b). 



C. Method (b), rich input information 

Results obtained by Method (b) (without a priori assump- 
tions on reversibility) are displayed in Figs. [3] and [4] Be- 
cause the ATPase pump turns out to be weakly active across 
the solution space (as in |24|), we have performed a second 
set of calculations forcing a strong flux through it (compa- 
rable with the GLC uptake, as in|12|). In the former case, 
compared to the Monte Carlo sampling of lfT2l two more re- 
actions are found to be effectively bidirectional in agreement 
with their standard assignment, i.e. PGK and lactate dehydro- 
genase (LDH). Note that when the former is operating back- 
ward, it activates a futile cycle with the Rapoport-Luebering 




FIG. 4 Marginal distributions of fluxes for the hRBC network ob- 
tained by Method (b) with rich input information under strong (red 
lines) and weak (black lines) ATPase flux. The reference unit 
for fluxes is fixed to the measured value of GLC uptake, namely 
4 • IQ-'^mol/s. 
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FIG. 5 Pearson coefficients of the fluxes obtained by Method (b) 
with rich input information under weak ATPase flux. 



shunt, a behavior that has been experimentally observed in 
acidic conditions |30|. When ATPase flux is large, instead, 
PGK is constrained in the forward direction. On the other 
hand, reverse LDH implies a LAC intake and a higher flux of 
NADHase. Simple biochemical arguments (see Section S5) 
suggest that the physiologically relevant scenario for hRBCs 
is that where the ATPase flux is much smaller than the GLC 
uptake. We shall henceforth focus on this case (corresponding 
results for strong ATPase flux are reported in Sec. S4). 

The pairwise correlations among fluxes (Fig. E\ show 
that glycolysis and the pentose-phosphate pathway form tight 
modules that are weakly anti-correlated with each other, 
mostly through the reversible phosphoglucoisomerase (PGI) 
reaction. The forward (resp. backward) direction of PGI in- 
deed indicates that glucose is processed preferentially through 
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FIG. 6 Pearson coefficients of the chemical potentials obtained by 
Method (b) with rich input information under weak ATPase flux. 



glycolysis (resp. pentose phosphate pathway). Glycolysis can 
furthermore be broken down into two separated blocks with 
a weaker (positive) cross-correlation. The nucleotides sal- 
vage pathway forms instead a weakly correlated module. The 
network ultimately presents six bidirectional reactions (PGI, 
PGK, LDH, R5PI, ApK, LACe). Comparing Figs. |4]and|5] 
one sees that the directions of PGK, LDH and PGI govern the 
state of ATPase, NADHase and NADPHase, respectively. For 
instance, the negative (resp. positive) part of the LDH distri- 
bution corresponds to the high (resp. low) flux state of the 
NADHase pump (LDH and NADHase are strongly anticorre- 
lated). 



between k/ and kj for / ^ j and between g(J. and kj (for each fi 
and /) this reduces to 



(Z^"). 



^£^r^X, (^^v) 



(16) 



This simply tells us that the dynamics tends to correlate (resp. 
anti-correlate) metabolites typically appearing on the same 
(resp. opposite) side of the reaction equations (such as ADP 
and ATP). 

In this respect, it is worthwhile to compute the amount of 
information gained (or lost) through the sampling algorithms 
essentially via the build-up of correlations. Note that indeed, 
since input distributions can be dynamically broadened, a loss 
of information is actually possible. The rationale behind this 
flexibility is that we are interested in estimating the ranges of 
variability of single cell states, which may exceed the uncer- 
tainty derived from experimental measures that are typically 
performed by averaging over a large number of cells. Now the 
information gain is related to the difference in entropy of the 
initial and final distributions, which is particularly simple to 
estimate assuming to be dealing with Gaussian distributions. 
Indeed the entropy of a Gaussian 2(g) is given by OTl 

S[Q(g)] = ^tr[log(K)] + f [1 + log(27r)] (17) 

where K is the covariance matrix of the g'^'s and M is the 
dimensionality of the space, i.e. the number of metabolites. It 
follows that 

1 '^ 
I = -(S [Pig)] - S [Poig)]) = - ^[log^^o) - log(^'')] ' (18) 



D. Chemical potential correlations 



We hinted above that the coupling of (J3]l and ( CI i generates 
non trivial correlations between the chemical potentials in fea- 
sible states in both Methods (a) and (b). A quantitative look 
at this aspect is reported in Fig. l6] Note that correlations were 
initially absent, since Poig) is assumed to be a product of un- 
correlated independent distributions. The algorithmic origin 
of such interdependecies can be understood considering that 
chemical potentials are updated dynamically through a series 
of reinforcement steps of the form as/^. It follows that the 
g'^'s can ultimately be written as g'' - g^^. + aY^jki^, where 
^. is the trial chemical potential sampled from Pq and ki is 
an index which is updated (increased or decreased by one ac- 
cording to the sign of the reaction) each time reaction / tries to 
invert. The covariance between chemical potentials can thus 
be decomposed as 



where A'^ and M' are respectively the eigenvalues of the covari- 
ance matrix for the initial and final distributions, is the infor- 
mation gain. Surprisingly, the information gain thus computed 
is found to be negative (/ ^ -3.98) for Method (a), and pos- 
itive (/ ^ 4.12) for Method (b), indicating that the thermody- 
namic sampling algorithm allows for a refinement of the input 
information. The rank plot of eigenvalues (see Fig. S5) shows 
clearly that Method (a) is capable of extracting information 
only from the metabolites with largest K eigenvalues (associ- 
ated with highly uncertain concentrations) and is generically 
outperformed by Method (b). 

In order to spot key-metabolites whose concentration vari- 
ability bears a strong influence on the global organization of 
fluxes, we can quantify the correlations between fluxes and 
concentrations emerging from Method (b) by computing the 
index 



;^- 



<(v/ - <v,))(g^ - {gn)) 



(19) 
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(g'^g^)c = SftvCf^+a y ^1{g'^ki}c+a y ^]{^^ki)c+a y ^^^J(fc;^y)yhich correlates the fluctuations in chemical potentials to 

those in fluxes. This matrix is shown in Fig. [7] High concen- 
trations of (1,3)-DPG and GA3P appear to favor the glycolytic 
pathway over the pentose-phosphate pathway, whereas the nu- 
cleotide rescue pathways is activated by high levels of INO 



(15) 

where {.)c stands for the connected correlation, crl is the vari- 
ance of P^, so that {^^gl^)c - 6f,ycrj^. Neglecting correlations 




hemoglobin (2% of the total). We obtain 



FIG. 7 Correlations between chemical potentials and fluxes (see 
l |19[ l) obtained by Method (b) with rich input information under weak 
ATPase flux. 



and R5P and low levels of HX and RIP, whose concentra- 
tion is instead correlated with high flux through the pentose- 
phosphate group. Note that concentrations of metabolites only 
appearing in far from equilibrium reactions (like (2,3)-DPG) 
are seldom modified by the algorithm and thus appear to be 
particularly stable, as the corresponding ;tf ^re weakly depen- 
dent on fluxes. 



E. Chemical energy balance 

The network nodes that are more tightly connected to the 
biological functionality of the haematids are the three ase 
pumps: ATPase (that regulates the internal osmotic pressure 
and the transmembrane potential), NADHase (that reduces 
hemoglobin from its metastable state) and NADPHase (that 
maintains the redox state of the cell by reducing glutathione). 
Their respective work loads can be estimated from basic bio- 
chemical parameters. 

The work carried out by the ATPase pump can be written as 



Watp = -FVo + RT 



CNai„, Ck+ 

3 log — ^-H21og- 



CNa 



^V. 



(20) 



where F ^ 96 KJ/(mol V) is the Faraday constant and Vo - 
-12 mV is the transmembrane potential, while CNa+ /cNat - 
20 and Cv* Icv* ^18 are the ratios between external and 

'^(m) ' '^(f At) 

internal levels of sodium and potassium ions. At T = 310K, 
we obtain Watp - 50 KJ/mol. 

The work caiTied out by the NADHase pump, specifi- 
cally by cytochrome b5 reductase, can instead be estimated 
from the standard values of the redox potential of the cou- 
ple Fe^^/Fe-'^ in the heme group of hemoglobin (namely 
Vi ^ 60 mV 128 J ) upon assuming normal levels of meta- 



W, 



NADH 



= -IFVi + 2RT log 



CHb 
CMetHb 



^ 10 KJ/mol 



(21) 



(The factor two comes from the fact that the reductase enzyme 
couples each NADHase with the reduction of two molecules 
of oxidized hemoglobin.) 

Finally we can estimate the work done by the NADPHase 
pump, glutathione reductase, from the standard redox poten- 
tial of the pair 2GSH/GSSG, i.e. V2 - -230mV, and from 
their concentrations measured in human red cells, given re- 
spectively by CGSH ^ 3.2 ■ 10-3 M, cgssg ^ 6.5 ■ lO^^ M |[28|. 
We find 



Wn 



= -FV2 + RT log 



Cgssg 



^ 55 KJ/mol. 



(22) 



The total amount of work per unit time performed by these 
three processes can now be written as 

W - VATPaseWATP + T^NADHase WnaDH + VNADPHase WnaDPH , 

(23) 
where VAxPase is the flux of ATPase (and similarly for the re- 
maining pumps). At the same time, the entropy produced by 
the cell per unit time reads 



TS^-Y, ^i^G, - Z «' ■ (^- -S')-^- 



(24) 



i£R\U 



iieU 



where the first sum coincides with the quadratic form de- 
fined by the stoichiometric matrix of intracellular reactions, 
i-S- Zi/,/j "^i^S^^ ^nd the second is over cross-membrane trans- 
port processes (uptakes), with u^ the uptake flux of metabolite 
//. In order to obtain a realistic description of the processes, 
after normalizing our computed fluxes with respect to the av- 
erage value of the GLC uptake (m ^ 4 ■ lO"*" mol/(l s) ll32l ). 
we scaled them by the cell volume (VhRBC - 90 fl). The naive 
thermodynamic efficiency of the hRBC can finally be evalu- 
ated as 



W 



J] 



W + TS 



(25) 



In Fig. 11 we display the distributions of W,TS,U -W + TS 
and 77 obtained from Method (b) with weak and strong ATPase 
flux. The two peaks in the distributions mirror those appear- 
ing in the flux distributions of the NADH and NADPH pumps. 
Remarkably, the computed efficiency of the hRBC is not far 
from that corresponding to optimal microbial growth, which 
was estimated to be close to 24% |33|. Note that the work 
performed per unit time by the NADPHase pump dominates 
the sum ( [23] l. Therefore the two peaks appearing in the dis- 
tribution of W miiTor those appearing in the distribution of 

^NADPHase- 



V. CONCLUSION 

Constraint-based models of cellular metabolism are impor- 
tant tools to analyze and predict the chemical activity and re- 
sponse to perturbations of cells without relying on kinetic and 
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FIG. 8 Distributions of work done (W), entropy produced (TS) and 
energy flow (f/) per unit time for the hRBC metabolism with strong 
(dotted line) and weak (straight line) ATPase flux. Inset: distribution 
of efficiency (77) in the same conditions. 



transport details that are often unavailable. In such frame- 
works, assessing the metabolic capabilities of a cell requires 
the exploration of a high dimensional space representing flux 
and chemical potential configurations compatible with mass- 
and energy-balance constraints. The complexity of the ensu- 
ing problem can in some cases be reduced by applying rea- 
sonable (though necessarily ad hoc) optimality criteria. Most 
often, however, one needs to sample "physiological configura- 
tions" from the solution space starting from a priori biochem- 
ical knowledge that could be noisy. The methods proposed 
here are designed to retrieve distributions of fluxes and chem- 
ical potentials (or concentrations) essentially by exposing and 
building up correlations between variables. This is substan- 
tially different from other approaches (e.g. ifTOl l22ll '). where 
either the flux distributions compatible with given chemical 
potentials or the reverse are sought. Disposing of reliable prior 
biochemical information is obviously a limiting factor in our 
case as well. The advantage lies in the fact that working with 
correlations allows for a greater flexibility in treating the bio- 
chemical input data. Besides providing information on feasi- 
ble physiological concentration ranges, flux distributions and 
reaction directionality, thermodynamic sampling can be em- 
ployed to evaluate the responsiveness of fluxes to fluctuations 
in concentrations (or the reverse) and to assess the thermody- 
namic efficiency of a cell. In the case considered here (the 
hRBC, where a comparison with results obtained by sampling 
the solution space of mass-balance equations is possible), we 
have estimated 77 ^ 0.24 assuming that the functional core 
of its metabolism lies in the three pumps (ATPase, NADHase, 
NADPHase). A similar calculation can be carried out on more 
complex genome-scale systems (e.g. E. coli) for a modest in- 
crease of computational costs (work in progress). 
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FIG. 9 Network module with 3 internal reactions, x, y and z between 
3 chemical species a, b and c and 2 uptake fluxes u and v. Unit 
stoichiometry is assumed. 
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Appendix A: Role of thermodynamic constraints: a 
simple example 

In order to clarify how thermodynamic constraints trans- 
late into bounds on the degrees of freedom, consider the small 
module shown in Fig. |9] A priori, it admits 2^ - 32 different 
direction assignments, corresponding to the 8 possible states 
for the 3 internal fluxes x, y and z, times the 4 possible states 
of the boundary fluxes u and v. Assuming unit stoichiome- 
try, one easily sees that 6 of the 8 states allowed for (x, y, z) 
are thermodynamically feasible, corresponding to the num- 
ber of ways in which one can order the chemical potentials of 
metabolites a, b and c. The excluded configurations present 
unfeasible cycles. Mass constraints further reduce the space 
of possible directions. Mass-balance equations (MBE) im- 
pose that each metabolite should be produced and consumed 
by at least one reaction, leaving only two feasible direction 
assignments. The softer Von Neumann conditions (VNC) in- 
stead force each metabolite to be produced at least by one 
reaction, leaving four possible states. Once the possible direc- 
tions of the boundary fluxes u and v are considered, one ends 
up with two thermodynamically and stoichiometric ally feasi- 
ble configurations for MBE and eight for VNC (down from 
32). 

Now let us focus on a specific flux model (say MBE) and 
note that it enforces u - v, x - y and v - x + z- This leaves 
two free parameters, e.g. v and x - z - A, and it is easy to 
show that the exclusion of unfeasible cycles implies \A\ < u. 
Similar though more lengthy arguments can be formulated for 
the Von Neumann flux model VNC. 

Further examples are discussed e.g. in 171 1341 . 



Appendix B: hRBC data 

The hRBC metabolic network employed for this study is 



shown in Fig. 10 It is formed by 40 metabolites (listed in Ta- 



10 




10\ 11 ^12 

-^ 2Pe |< s[pEp] tjPYR^ 4*^0 



Appendix C: Details on the implementation of Methods 
(a) and (b) 

Here we discuss in some detail four aspects connected to 
the implementation of Methods (a) and (b), namely 

• the minimal amount of input information on chemical 
potentials (to be included in Poig)) needed to recon- 
struct the free energy landscape via Method (a); 

• the form of the input information on chemical potentials 
(i.e. of i^o(g)) in the poor versus rich input information 
scenarios; 

• the form of the input information on reaction directions 
sampled from MBE and VNC, required by Method (a) 
only; 

• the choice of the "learning parameters" a and /3 that ap- 
pear, respectively, in both methods and in Method (b) 
only, that characterize the size of the update step in, re- 
spectively, chemical potentials and fluxes. 



FIG. 10 The hRBC network employed in this study. 



1. Minimal input information on chemical potentials 

To begin with, let us note that the thermodynamic constraint 



sig 



ble [l]| interacting through 35 intracellular reactions (listed in 
Table|n|i and subject to 12 (for Method (a)) or 13 (for Method 
(b)) uptakes. (Bicarbonate outtake was only accounted for in 
Method (b).) The network can be divided in three main path- 
ways, namely glycolysis (reactions 1-13), the pentose phos- 
phate pathway (reactions 14-21), and a nucleotide salvage 
pathway (reactions 22-32). The directions displayed corre- 
spond to the standard physiological assignment. The network 
coincides with the reconstruction presented in 1 12], except for 
the inclusion of bicarbonate (HCO3 ), of the carbonic anhy- 
drase reaction (33) and of a bicarbonate uptake. Table [I] also 
provides the standard chemical potential for each metabolite 
and an estimate of the intracellular concentration (when avail- 
able). Potentials are computed in the standard biochemical 
state, i.e. in acqueous solution at fixed temperature, pres- 
sure, ionic strength and pH and are available in Ii22l . where 
they are calculated according to the prescriptions of ll35l l36l 
at r = 298 K, f = 1 atm, pH = 7.6 and ionic strenght I = 0.15 
M. The physiological conditions of the hRBC are known to be 
slightly different (T - 310 K, pH - 7.2), but such differences 
do not affect significantly the results of our procedures. The 
estimated intracellular concentration ranges are obtained from 
measurements in different settings. It is worth to notice that 
the experimental errors on such values of concentrations re- 
flect an essential uncertainty rather than statistical fluctuations 
from cell to cell, since standard measurements of concentra- 
tions are usually carried out averaging over many (10^ - 10*^) 
cells. 



n(v,)X^r^'^ 



>0 



(CI) 



Finally, in Table III we report the reference values for blood 
tests l,37J of the level of metabolites of interest in serum. 



imposes that the chemical potential of metabolites that are 
sources (resp. sinks) of the network should be known, since 



the constraints (CI 1 do not bound the corresponding ^'"s from 
above (resp. below). Clearly, then, computing the landscape 
of chemical potentials is feasible only if Po(g) carries some 
prior information on the ^'''s. 

Besides network 'leaves', one easily sees that certain intra- 
cellular potentials should be known as well. Consider a chem- 
ical potential vector g* = (g^) that satisfies (CI I and note that 
g*. + kA is again a solution for k eR provided A - {A^"} is such 
that Yjff=i '^^^l - for each /. This degeneracy is related to 
the existence of equilibrium states and needs to be lifted. Ide- 
ally this can be achieved by fixing the chemical potentials of 
at least one of the metabolites with /l'^ ^ for each vector A of 
the type described above. Note that such vectors include (but 
are not limited to) the conserved metabolic pools defined in 
B9I . A conserved pool is a group P of metabolites described 
by a vector f = {{^} with £^ > if jj e P, and zero other- 
wise, such that, for each /, 2fli f^^*^ = 0. From a physical 
viewpoint, each such pool corresponds to a conservation law 
for the aggregate concentration of the corresponding metabo- 
lites and the existence of one pool suffices to force y'^ = 
for each fi e P in the VNC scenario (in other words, metabo- 
lites belonging to conserved pools can not be producible) |50|. 
The simplest way to rule out equilibrium solutions is to clamp 
the chemical potential of a group P' of metabolites (including 
network sources and sinks) such that the problem 



£r^r = ;^r^r + ;^^'^^f = o v/ 
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Abbr. 


Compound name 


go [KJ/mol] 


f [M] 


GLC* 


Glucose 


-387 


5 ± 1 ■ 10-3 |27j 


G6P 


Glucose-6-phosphate 


-1281 


4 ± 1 ■ 10-5 |:27j 


F6P 


Fructose-6-phosphate 


-1278 


1.3 ±0.5 -10-5 |27| 


FDP 


Fmctose- 1 ,6-diphosphate 


-2171 


2.7± 1- 10'^ 1271 


DHAP 


Dihydoixyacetone phosphate 


-1070 


1.7 ±0.1 -10-5 (39) 


GA3P 


Glyceraldehyde-3-phosphate 


-1078 


5.7± 1- lO"** 03 


13DPG 


1 ,3-Diphosphoglycerate 


-2191 


1 ± 0.5 ■ 10-'^ (40i 


23DPG 


2,3-Diphosphoglycerate 


-2240 


4 ± 3 ■ 10-3 |4oj 


3PG 


3-Phosphoglycerate 


-1332 


4 ± 2 ■ 10-5 (391 


2PG 


2-Phosphoglyceiate 


-1326 


1.4 ±0.5 -10-5 (39) 


PEP 


Phosphoenolpymvate 


-1181 


1.7 ±0.2- 10-5 (27) 


PYR' 


Pyruvate 


-341 


8 ± 6 ■ 10-5 |39j 


LAC* 


Lactate 


-297 


1.4 ±0.5- 10-3 (27) 


6PGL 


6-Phosphogluco-lactone 


-1352 




6PGC 


6-Phosphogluconate 


-1353 


5 ± 2 ■ 10-* pTl 


RL5P 


Ribulose-5 -phosphate 


-1201 




X5P 


Xylusose-5-phosphate 


-1203 




R5P 


Ribose-5 -phosphate 


-1202 




S7P 


Sedoheptulose-7-phosphate 


-1336 




E4P 


Erythrose-4-phosphate 


-1125 


5 ± 2 ■ 10-5 (42) 


PRPP 


5-Phosphoribosyl- 1 -pyrophosphate 


-2949 


5 ± 1 ■ 10-5 (38l 


IMP 


Inosine monophosphate 


-774 


3 ± 1 ■ 10-5 14311441 


RIP 


Ribose- 1 -phosphate 


-1194 


6 ± 5 ■ 10-* 


HX* 


Hypoxanthine 


274 


2 ± 1 • 10-* (381 


INO* 


Inosine 


119 


1.3 ±0.5- 10-* (381 


ADE* 


Adenine 


538 


1.3 ±0.7 -10-5 (381 


ADO* 


Adenosine 


378 


1.3 ±0.3 -10"* El 


AMP 


Adenosine monophosphate 


-514 


8 ± 1 ■ lO-'* (28l 


ADP 


Adenosine diphosphate 


-1384 


1.0 ±0.1 -10-3 (281 


ATP 


Adenosine triphosphate 


-2250 


7.9 ±0.1- 10-3 (281 


NAD 


Nicotinamide adenine dinucleotide 


1145 


7 ±2- 10-5(391 


NADH 


Nicotinamide adenine dinucleotide(R) 


1209 


around IQ-^ (ext) 


NADP 


Nicotinamide adenine dinucleotide phosphate 


260 


1.4±0.5-10-*|45:i 


NADPH 


Nicotinamide adenine dinucleotide phosphate (R) 


325 


4 ± 2 ■ 10-5 (45) 


H* 


Hydrogen ion 





10-7-2 (4g) 


Pi* 


Inorganic phosphate 


-1055 


1.0 ±0.5 -10-3 (281 


NH^ 


Ammonia 


96 


5 ± 1 ■ 10-5 |47j 


CO* 


Carbon dioxide 


-543 


1.2 ±0.5- 10-3 |43) 


H2O* 


Water 


-149 


solvent 


Hco; 


Bicarbonate 


-710 


1.6 ±0.1- 10-2 |4g:i 



TABLE I Metabolites included in the human red blood cell metabolic network used in this study, go represents the standard chemical potential 
of the metabolite, while c denotes an indicative value for the experimentally estimated intracellular concentration, when available. Compounds 
marked by an asterisk are subject to uptakes. All data were searched through the Bionumbers database L38J : pointers to explicit references are 
given when available. 



admits no solution. From a geometrical perspective, the min- 
imal number of metabolites whose chemical potentials needs 
to be constrained is that for which the number of independent 
equations in the above system exceeds the number of vari- 
ables. 



2. Poor versus rich input information 

In the case of the hRBC, there are four source/sink nodes, 
namely glucose (GLC), lactate (LAC), ammonia (NH3) and 
carbon dioxide (CO2), whereas a basis for the left kernel of 
the stoichiometric matrix E (excluding uptakes) turns out to 
be composed by seven vectors: 3 conserved pools, namely 
the pairs (NAD, NADH) and (NADP, NADPH) and the larger 
pool formed by (HX, INO, IMP, ADE, ADO, AMP, ADP, 
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Nr 


Abbr 


Enzyme 


reaction 


1 


HK 


Hexokinase 


CLC + ATP -^ GbP + ADP + H 


2 


PGI 


Phosphoglucoisomerase 


CbP « FbP 


3 


PFK 


Phosphofructokinase 


FbP + ATP^ FDP + ADP + H 


4 


ALD 


Aldolase 


FDP « CA3P + DHAP 


5 


TPI 


Triose phosphate isomerase 


DHAP « CA3P 


6 


GAPDH 


GLyceraldehyde phosphate dhydrogenase 


CAiP + NAD + Pi ^ \ 3DPG + NADH + H 


7 


PGK 


Phosphoglycerate kinase 


liDPG + ADP « 3PC + ATP 


8 


DPGM 


Diphosphoglyceromutase 


\3DPG ^ liDPC + H 


9 


DPGase 


Diphosphoglycerate phosphatase 


23DPG + H2O ^ 3PG + Pi 


10 


PGM 


Phosphoglyceromutase 


3PG « 2PG 


11 


EN 


Enolase 


2PG ^ PEP + H2O 


12 


PK 


Pymvate kinase 


PEP + ADP + H -^ PYR + ATP 


13 


LDH 


Lactate dehydrogenase 


PYR + NADH + H « LAC + NAD 


14 


G6PDH 


Glucose-6-phosphate dehydrogenase 


GbP + NADP -^ bPGL + NADPH + H 


15 


PGL 


6-phosphoglyconolactonase 


bPGL + H2O -r^ bPCC + H 


16 


PDGH 


6-phosphoglycoconate dehydrogenase 


bPGC + NADP -4 RL5P + NADPH + CO2 


17 


R5PI 


Ribose-5-phosphate isomerase 


RL5P « R5P 


18 


X5P 


Xylulose-5-phosphate epimerase 


RL5P « X5P 


19 


TKI 


Transketolase I 


X5P + R5P^S7P + GA3P 


20 


TA 


Transaldolase 


GA3P + S1P^ E4P + FbP 


21 


TKII 


Transketolase 


X5P + E4P « FbP + GA3P 


22 


PRPPsyn 


Phosphoribosyl pyrophosphate synthetase 


R5P + ATP^ PRPP + AMP 


23 


PRM 


Phosphoribomutase 


R\P « R5P 


24 


HGPRT 


Hypoxanthine guanine phosphoryl transferase 


PRPP + HX + H20^ IMP + 2Pi 


25 


AdPRT 


Adenine phosphoribosyl transferase 


PRPP + ADE + H2O -^ AMP + 2Pi 


26 


PNPase 


Purine nucleoside phosphoiylase 


lNO + Pi<^HX + RlP 


27 


IMPase 


Inosine monophosphatase 


IMP + H2O ^ INO + Pi + H 


28 


AMPDA 


Adenosine monophosphate deaminase 


AMP + H20^ IMP + NHi 


29 


AMPase 


Adenosine monophosphate phosphohydrolase 


AMP + H20^ ADO + Pi + H 


30 


ADA 


Adenosine deaminase 


ADO + H2O ^ INO + iVH, 


31 


AK 


Adenosine kinase 


ADO + ATP -^ ADP + AMP 


32 


ApK 


Adenylate kinase 


2ADP<r4ATP+AMP 


33 


CA-I 


Carbonic anhydrase 


CO2 + H2O « HCO, + H 


34 


ATPase 


Sodium-Potassium ionic pump 


ATP + H2O -^ ADP + Pi 


35 


NADHase 


Cytochrome-b5 reductase 


NADH -^ NAD + H 


36 


NADPHase 


Glutathion reductase 


NADPH -^ NADP + H 



TABLE II Intracellular reactions included in the human red blood cell metabolic network used in this study. 



Compound 


Range (M) 


GLC 


4-6- 10-3 


PYR 


3-10- 10-5 


LAC 


0.5 - 2.2 ■ 10-3 


H+ 


lO-V-3 _ 10-V.45 


Pi 


0.8- 1.5- 10-3 


NH3 


1 - 6 ■ 10-5 


CO2 


2-3-10-2 


HCO- 


1.8-2.3- 10-2 



number of carbon atoms and phosphate groups. Analyzing 



TABLE III Estimated concentration levels in blood serum. 



ATP), plus 4 other vectors whose components are related in 
a non-intuitive way to the balance of global quantities like the 



( C2 1, however, one finds that by clamping four metabolites 
(besides leaves) no equilibrium solution can exist. 

In the poor input information scenario, we have therefore 
constructed Po(g) by fixing the chemical potentials of GLC, 
LAC, NH3 and CO2 as well as of inosine monophosphate 
(IMP), adenine (ADE), NAD, and NADP to remove the de- 
generacy associated to the presence of equilibrium solutions 
(different choices for these do not alter results). For these 
metabolites, P'^ig'') = S(g^ - ^xp), where g^xp is the chemical 
potential obtained from experimental data. For the other com- 
pounds we have chosen a /^ that reproduces the overall statis- 
tics of chemical potentials. In particular, each g'' (in units of 
KJ/mol) is selected independently and uniformly in [0, 2000] 
with probability p - 0.2 and in [-5000, 0] with probability 
1-/5 = 0.8. 
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In the rich input information scenario, f (g) was constructed 
as follows. The chemical potential of metabolites whose in- 
tracellular concentrations are known experimentally were ex- 
tracted from the formula g^' - g[., + RTlogc^. Here ^ is 
the free energy of formation of the metabolite fj. in the stan- 
dard biochemical state. These values are accurately known 
for metabolites in the hRBC metabolic network (see Table [l]i. 
The concentrations c'^ are instead taken to be uniformly and 
independently distributed random variables with average val- 
ues and box sizes according to the experimental estimates re- 
ported in Table 11] The concentrations of metabolites for which 
we were unable to find reliable experimental estimates are as- 
sumed to be uniformly and independently distributed random 
variables centered around 10 '^ M and spanning four orders of 
magnitude symmetrically around the mean. Note that in this 
case no metabolite has a clamped chemical potential and the 
algorithm can modify the trial distributions of all metabolites. 

In both scenarios, the chemical potential of water is treated 
as a boundary condition, i.e. it is kept fixed since we assume 
that water is in a condensed phase. 



3. Input information for Method (a) on directions sampled 
from MBE and VNC 

In fT2l and f24l|, the hRBC network is assumed to oper- 
ate under the MBE and VNC scenarios, respectively, in both 
cases starting from prior assignments for reaction directions. 
The implementation of Method (a) makes use of the reaction 
directions obtained in these studies. In summary: 

• according to MBE, the net flux of all reactions is in the 
forward direction, except PGI, R5PI and ApK, (which 
are found to operate bidirectionally); 

• according to VNC, the net flux of all reactions is in the 
forward direction, except R5PI (which is found to be 
operating bidirectionally), and PGI and ApK (which are 
found to operate in the backward direction). 

For comparison the thermodynamic sampling Method (b) 
(which doesn't require a priori reversibility assumptions) pro- 
vides solutions in which the net flux of all reactions is in the 
forward direction, except PGI, PGK, LDH, R5PI and ApK 
(which are found to operate bidirectionally). Note that reverse 
PGK activates a futile cycle with the Rapoport-Luebering 
shunt. This behavior has been experimentally observed in 
acidic conditions ll30l . 

It should be noted however that the network used here 
presents an additional intracellular reaction (CA-I), an addi- 
tional metabolite (HCO3) and an additional uptake with re- 
spect to the networks studied above. Furthermore, the recon- 
structions employed in lfT2ll and ll24l have slight but important 
differences in the structure of uptakes. 



4. Setting the learning rates 

Both Methods (a) and (b) rely on learning rates (denoted as 
a for Method (a) and a and /? for Method (b)) that fix the size 



of the adjustment to be applied to chemical potentials (a) and 
fluxes (J3) at each iteration step. 

For the implementation of Method (a) presented here, we 
set a to the (iteration-dependent) value -2x,-„/ 2/j(^f )^- One 
easily sees that if we let all the chemical potentials evolve 
during the sampUng, this value of a has the net effect of re- 
verting x,„ at each iteration while keeping the norm Y^fiiSii^^^ 
that defines the unit of the energy scale, constant. Note how- 
ever that some of the chemical potentials must be kept fixed 
during the sampling (specifically those belonging to the set of 
compounds for which a priori information is required). In this 
case, the expression for a given above guarantees that the el- 
ementary step of the algorithm is proportional to the amount 
by which the constraint is violated. We found empirically that 
for this choice the convergence time decreases by a factor of 
about 10 with respect to the case in which a constant a is em- 
ployed. 

In the implementation of Method (b) we chose the value 
-23'p„/2(^^°(p))^ for /3 (for the same reasons as above), 
while keeping a constant learning rate for chemical poten- 
tials (a - 0.001). This was motivated by the need to try to 
keep a "timescale" separation between the dynamics of fluxes 
and that of chemical potentials, with the latter preferentially 
slower than the former In essence, this emphasizes the role 
of correlations in building up the solutions while potentially 
limiting the exploration of states to regions in the space of 
chemical potentials that are not too far from the initial states. 

Finally, we remark that the solution space spanned by the 
completely reversible Von Neumann constraints may lack 
convexity f or p < 1 |26j. This leads to a certain rejection 
rate of solutions (roughly 10%), which represents a negligible 
additional cost in computations. 



Appendix D: Further results from Method (b) 



Fig. 11 shows the flux-flux correlations computed from 



Method (b) assuming strong flux through the ATPase pump. 
Compared to the case where ATPase flux is weak, one notices 
that glycolysis is split in two separate but correlated modules, 
formed respectively by the first six and the last three reactions 
(in agreement with |12|). Similarly, the pentose phosphate 
pathway is split in two tight modules, while the nucleotides 
salvage pathway no longer forms a compact group. Note that 
the role of PGK and PGI as switches for the ATPase and NAD- 
PHase pumps is much less pronounced in this case. 

The average production profile computed by Method (b) is 
displayed in Fig. 12 One sees that all intracellular metabo- 
lites are mass-balanced except PYR, LAC, H^ and HCO3. 
This means that VNC solutions predict a slow steady growth 
of their concentration. However all of these are subject to 
outtakes, therefore the final state of the cell is globally mass 
balanced (i.e. the solutions of thermodynamically-constraint 
VNC would coincide with those of thermodynamicaUy- 
constraint MBE on the same system). 

The rank plot of the eigenvalues of the chemical potential 
correlation matrix K computed from Methods (a) and (b) is 
reported in Fig. 



1 3 One clearly sees that Method (b) outper- 
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FIG. 1 1 Pearson coefficients of the fluxes obtained by Method (b) 
with rich input information with strong ATPase flux. 



State 


TS(m) 


IV(fW) 


U{fW) 


n 


average eff., weak ATPase flux 


226 


51 


278 


0.18 


average eff., strong ATPase flux 


235 


67 


302 


0.22 


maximal eff., weak ATPase flux 


256 


92 


348 


0.264 


maximal eff., strong ATPase flux 


268 


112 


381 


0.295 



TABLE IV Summary of the energy balance of the human red blood 
cell metabolic network 



Reaction 


Flux [M/s] 


HK 


4.3 ± 0.7 • 10-" OH 


GSSGR 


5 + 1 • 10-^ |32) 


RLS 


14-10-' (STl 


AMPcat 


3 • 10-" m 



TABLE V Measured fluxes in the red blood cell. GSSGR stands for 
glutathione reductase; RLS for Rapoport-Luebering shunt; AMPcat 
is AMP catalysis. 




forms Method (a) in gaining information over the trial distri- 
bution Poig). Note in particular that Method (a) typically loses 
information on metabolites with small variability range in the 
trial distribution (coiTesponding to the smallest eigenvalues of 
K). 



Finally, Table IV displays a summary of the chemical en- 



ergy balance of the hRBC metabolism. 



FIG. 12 Average production profile obtained by jointly sampling 
fluxes and chemical potentials with ATPase ionic pump active. Note 
that fluxes are measured in units of GLC uptake. 



Input information 
Mettiod (a) 
Mettiod (b) 




FIG. 13 Rank plot of the eigenvalues of the correlation matrix of 
chemical potentials for Po(g) (black), and for the distributions ob- 
tained via Methods (a) (red) and (b) (green), for rich input informa- 
tion. Note that the ^i-scale is logarithmic. 



Appendix E: Approximate energy balance analysis. 

One of the results emerging from the sampling of fluxes 
and chemical potentials is that the overall flux through the nu- 
cleotide rescue pathway is about one order of magnitude lower 
than that going through glycolysis and/or pentose phosphate 
pathway. This is in agreement with previous results based on 
MBE and VNC without explicit thermodynamic constraints 
|24| and the known experimental values (see Table [V|. 

It is interesting to notice that in order to maintain the ase 
pumps the activation of glycolysis and pentose phosphate 
pathway is sufficient. To see this, we can work out the station- 
ary state of the network in absence of the nucleotide salvage 
pathway, i.e. for a reduced network consisting of 13 (glycol- 
ysis) + 8 (PPP) + 3 (pumps) = 24 intracellular reactions and 
5 uptakes (GLC, PYR, LAC H2O and HCO3). The mass bal- 
ance equations reveal that only four of the above reactions 
are linearly independent: we choose mglc^ mlac (glucose and 
lactate uptake), vg6pdh and vrls (RLS stands for Rapoport- 
Luebering shunt). All fluxes can be written in terms of these. 
In particular, the quantities of interest in the energy balance 
take the form 



,GLC 



„PYR\ 



U - «GLc(^^i,^ - 28^^,'^,) + «LAc(^;;^)i - 8l\p+ 



+ VG6PDH(l/3g(„,) ■ 



■28 



H2O 

(ext) 



HCO 



^SD (El) 



^(ext) 
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for the energy flow and [18 

T^ATPase = 2mgLC " Vg6PDh/3 - VrlS (E2) 

VNADHase - 2mgLC - VG6PDH + "LAC (E3) [19 

VNAPDHase - ^GSSGR = 2vg6PDH (E4) 

for the fluxes through the pumps. Now under flux balance 
mglc = I'HK- Hence from the values in Table [V| we can es- '- 
timate VAXPase - 3 ■ 10"^ M/s, i.e. one order of magnitude 
smaller than the glucose uptake as in [24| and in agreement r2i 
with our thermodynamic sampling without constraints on the 
ATP ionic pump. [22 
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